Exploiting multi-core processors for scientific applications using hybrid MPI-OpenMP
Authors
Abstract
Most current and emerging high-performance systems consist of large numbers of processors set within an architecture with ‘fat’ shared-memory nodes supporting tens of threads per node. There are good reasons to adopt a hybrid MPI-OpenMP programming model for large-scale applications on such architectures, but this adds complexity to the parallel program and demands scalability at two levels: MPI across nodes and OpenMP within a node. We present performance and scaling studies for four applications (Fluidity-ICOM, NEMO, PRMAT and a 3D Red-Black Smoother) that use the hybrid MPI-OpenMP programming model. We show that, for computations that use a large number of cores, the hybrid approach significantly improves performance, provided that algorithms with minimal synchronisation and suitable libraries are used.
Similar resources
HOMPI: A Hybrid Programming Framework for Expressing and Deploying Task-Based Parallelism
This paper presents HOMPI, a framework for programming and executing task-based parallel applications on clusters of multiprocessors and multi-cores, while providing interoperability with existing programming systems such as MPI and OpenMP. HOMPI facilitates expressing irregular and adaptive master-worker and divide-and-conquer applications while avoiding explicit MPI calls. It also allows hybrid sha...
Performance modeling of hybrid MPI/OpenMP scientific applications on large-scale multicore supercomputers
In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore supercomputers: IBM POWER4, POWER5+ and BlueGene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM m...
Chinese Academy of Science Institute of Computing Technology Key Laboratory of Computer System and Architecture Understanding Parallelism in Graph Traversal on Multi-core Clusters
There is an ever-increasing need for exploring large-scale graph data sets in computational sciences, social networks, and business analytics. However, due to their irregular and memory-intensive nature, graph applications are notorious for their poor performance on parallel computer systems. In this paper we attempt to present a deep understanding of parallelism in graph traversal on distrib...
Using Hybrid Parallel Programming Techniques for the Computation, Assembly and Solution Stages in Finite Element Codes
The so-called “hybrid parallelism paradigm”, which combines programming techniques for architectures with distributed and shared memories using the MPI (Message Passing Interface) and OpenMP (Open Multi-Processing) standards, is currently adopted to exploit the growing use of multi-core computers, thus improving the efficiency of codes in such architectures (several multi-core nodes or clustered sym...
Mixed-mode implementation of PETSc for scalable linear algebra on multi-core processors
With multi-core processors a ubiquitous building block of modern supercomputers, it is now past time to enable applications to embrace these developments in processor design. To achieve exascale performance, applications will need ways of exploiting the new levels of parallelism that are exposed in modern high-performance computers. A typical approach to this is to use shared-memory programming...
Journal title:
Volume, issue:
Pages: -
Publication date: 2015